Construction of a fiber-optically connected MEG hyperscanning system for recording brain activity during real-time communication

Communication is one of the most important abilities in human society, which makes clarification of brain functions that underlie communication of great importance to cognitive neuroscience. To investigate the rapidly changing cortical-level brain activity underlying communication, a hyperscanning system with both high temporal and spatial resolution is extremely desirable. The modality of magnetoencephalography (MEG) would be ideal, but MEG hyperscanning systems suitable for communication studies remain rare. Here, we report the establishment of an MEG hyperscanning system that is optimized for natural, real-time, face-to-face communication between two adults in sitting positions. Two MEG systems, which are installed 500 m away from each other, were directly connected with fiber optic cables. The number of intermediate devices was minimized, enabling transmission of trigger and auditory signals with almost no delay (1.95–3.90 μs and 3 ms, respectively). Additionally, video signals were transmitted at the lowest latency ever reported (60–100 ms). We furthermore verified the function of an auditory delay line to synchronize the audio with the video signals. This system is thus optimized for natural face-to-face communication; in addition, music-based communication, which requires higher temporal accuracy, is possible via audio-only transmission. Owing to the high temporal and spatial resolution of MEG, our system offers a unique advantage over existing hyperscanning modalities of EEG, fNIRS, or fMRI. It provides novel neuroscientific methodology to investigate communication and other forms of social interaction, and could potentially aid in the development of novel medications or interventions for communication disorders.

Introduction (ADL) which permits synchronization of the transmitted audio and video signals. Here, we describe the constitution of the MEG hyperscanning system, and methods and results of evaluation of its audiovisual latencies.

Fiber optics and MEGs
Our MEG hyperscanning system was constructed by connecting two MEGs installed at Hokkaido University Medical and Dental Research Building (site A) and Hokkaido University Hospital (site B) using 473 m of fiber optic cables (Fig 1). Transistor-transistor logic (TTL) signals were used to verify transmission latency between the two MEG devices. The TTL signals were produced by a PC installed at site A, and transmitted to the MEG data acquisition systems (MEG Acqs) at both sites. The MEG hyperscanning system had an audio/visual (A/V) transmission system, which facilitated realistic, face-to-face communication between participants at the two sites. The video system was unified to 1080p/60p. Audio signals were synchronized with video signals using the ADL.

TTL setup
We used TTL signals to match the timing of measurements between the two sites. TTL signals were transmitted from site A to site B as follows.
4. The transmitted signals were decoded into electrical TTL signals using an identical conversion module at site B.
5. The decoded electrical TTL signals were received by the MEG Acqs at site B.

Video setup
To ensure the potential to visualize small changes in the facial expressions of future participants, the video systems needed high resolutions and frame rates. Here, progressive scanning has advantages for motion recording and playing. Therefore, video signal transmission was unified to 1080p/60p. Video signals are transmitted from one site to the other site as follows:
1. At one site, video signals are sampled by an HD camera (GP-KH232A, Panasonic) in a shielded room, transmitted to the HD camera control unit outside the shielded room via a 15 m cable, and converted into HDMI signals.

Audio setup (with ADL synchronization)
As video signals are output in units of frames, the latency of video signals strongly depends on the presentation time of each frame. In contrast, audio signals are transmitted without frames; as a result, audio signals were expected to be transmitted more rapidly than video signals (see Results for details). Therefore, to adjust the latencies of the audio signals, we additionally tested their output via the ADL. Audio signals are transmitted from one site to the other site using the ADL as follows:
6. The decoded signals are transmitted to the A/V mixer via ADL (ADL-40, Imagenics).
7. The audio signals are played on a non-magnetic speaker (Audio Element N-20 in SSHP60X20, Panphonics) via the A/V mixer.

TTL latency measurement
The standard signaling latency between the two sites was defined by a TTL signal. The latency of the TTL signal, which consists of the durations of conversion and transmission (Fig 1), was measured as follows. A TTL signal generated by a PC at site A was recorded by a digital oscilloscope (Advantest, R9211E digital spectrum analyzer) at site A after a round trip to site B (loop back condition). The same TTL signal was directly recorded by the same digital oscilloscope without the round trip (direct condition). The time difference between those two conditions was evaluated. Half of the time difference was defined as the TTL signal latency. The sampling frequency of the digital oscilloscope was set to 256.41 kHz (sampling interval of 3.90 μs). The TTL signals were transmitted and recorded 100 times to confirm the reproducibility of our latency measurement.
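Because the oscilloscope reports time differences only in whole sampling intervals, the one-way latency can be bounded rather than pinned to a single value. The following is a minimal sketch of that bounding step, not the authors' analysis code. The function name is hypothetical, and it assumes that a reading of k intervals means the true difference lies between (k - 1) and k intervals, which mirrors the interpretation given in the Results for the 7.80 μs reading.

```python
def one_way_latency_bounds(measured_diff_us, sampling_freq_khz):
    """Bound the one-way latency implied by a quantized round-trip difference.

    Hypothetical helper: assumes the true round-trip difference lies within
    one sampling interval below the reading reported by the oscilloscope.
    """
    interval_us = 1000.0 / sampling_freq_khz  # kHz -> microseconds per sample
    lower_round_trip_us = measured_diff_us - interval_us
    upper_round_trip_us = measured_diff_us
    # The one-way latency is defined as half of the round-trip difference.
    return lower_round_trip_us / 2.0, upper_round_trip_us / 2.0


# Values quoted in the text: 7.80 us reading at 256.41 kHz sampling.
low_us, high_us = one_way_latency_bounds(7.80, 256.41)
print(f"one-way latency between {low_us:.2f} and {high_us:.2f} us")
```

With the values quoted in the text, the bounds reproduce the reported range of 1.95-3.90 μs.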

Video and audio latency measurement overview
The latency of video or audio signals is caused by conversion, transmission, and passage of the signal through all intermediate devices.
To measure the latencies of the video or audio signal from one site to the other, references were required to know the onset time of the signal. The reference signals also had inherent latencies. Therefore, the latencies of the reference signals were also evaluated. Latencies were measured 100 times with a digital oscilloscope and averaged. Jitter was observable as distortion in the averaged latency. When evaluating jitter, latencies derived via the MEG Acqs at a sampling rate of 1,000 Hz were analyzed to determine their mode, average, variance, and range.

Video latency measurement
Flashing LED lights and photodiodes to detect them were used to measure the latency of the video signal.
Reference signal. The reference signal for video was a square signal which was generated by the output of the photodiode detecting the LED light. The square signal was directly transmitted from one site to the other site via the same pathway as the TTL signal described above, and input into the MEG Acqs at the receiver site. The sampling rate was 1,000 Hz.
Measurement signal. The LED light was flashed 200 times in each of five sessions (1,000 times in total) at site A. The light was captured by the video camera at site A, and transmitted via all intermediate devices to site B, where the light was projected into the shielded room and detected by a photodiode. A square signal was generated by the output of the photodiode and input into the MEG Acqs at site B. This measurement process was performed in the opposite direction as well.

Audio latency measurement and adjustment
Sine waves (250 Hz, 100 ms, 5 ms rise/fall) generated by a PC were used to measure the latency of the audio signal.
Reference signal. The reference signal for audio transmission was a sine wave generated by a PC at site A. It was recorded by a digital oscilloscope at site A after a round trip to site B via optical analogue link (Transmitter, PE-1800TAF, Optex; Receiver, PE-1800RAF, Optex).
The loop-backed sine wave was compared with the original one on the same digital oscilloscope at site A.
Measurement signal. The sine wave signal generated by the PC at site A was split. One part was transmitted directly to site B and recorded on a digital oscilloscope. The other part was played on a non-magnetic speaker and sampled by a monaural microphone in the shielded room at site A. The signal captured by this microphone was then transmitted to site B, where it underwent digital audio conversion and passed through the A/V mixer. It was then re-played on a non-magnetic speaker and sampled by a monaural microphone in the shielded room at site B. The sampled signal was recorded on the same digital oscilloscope at site B. The audio waves of the two split signals were compared. This measurement process was performed in the opposite direction as well.
Audio latency adjustment. After determining the latencies of the audio and video signals, the latency of the audio signal was adjusted to the latency of the video signal by ADL, as appropriate. The minimum adjustment width of ADL was 1 ms.
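The adjustment rule can be illustrated with the values reported in the Results (mean video latency 76.85 ms, SD 6.57 ms, unadjusted audio latency of about 3 ms). This is a sketch under stated assumptions: the function name is hypothetical, and rounding to the nearest ADL step is our assumption, since the text states only that the chosen delay is approximately two standard deviations above the mean video latency.

```python
def adl_delay_ms(video_mean_ms, video_sd_ms, n_sd=2.0, step_ms=1.0):
    """Choose an ADL delay near mean + n_sd * SD of the video latency.

    Hypothetical helper: rounding to the nearest ADL step is an assumption.
    """
    target_ms = video_mean_ms + n_sd * video_sd_ms
    return round(target_ms / step_ms) * step_ms


# Values quoted in the Results; the ADL step of 1 ms is stated in the text.
delay_ms = adl_delay_ms(76.85, 6.57)
unadjusted_audio_ms = 3.0
adjusted_audio_ms = unadjusted_audio_ms + delay_ms
print(f"ADL delay: {delay_ms:.0f} ms, adjusted audio latency: {adjusted_audio_ms:.0f} ms")
```

Placing the audio about two standard deviations after the mean video latency means that the sound follows the image for most frames, rather than preceding it.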

Electrophysiological experiment
One pair of subjects (a 23-year-old female and a 25-year-old male) participated. Signed informed consent was obtained from both subjects before the experiment. The MEG recordings were approved by the Ethics Review Board of the Graduate School of Medicine at Hokkaido University.
The two subjects faced each other via the A/V devices and spoke words in turns according to timed cues. The speech audio signals from each site were transmitted to the opposite site with a 90 ms delay using the ADL to align them with the visual signal delay. MEGs were recorded during 128 speech exchanges of this alternate speaking protocol. The amplitude modulations of the alpha-band rhythms across all 128 exchanges were averaged and then normalized in each subject based on their average alpha amplitude over the period from -2,000 ms to -1,000 ms prior to the speech onset of the other subject (S2 File). The resulting normalized mean alpha activity was then mapped onto template brains. Data analysis was performed with Brainstorm [34].
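The averaging and baseline normalization described above can be sketched as follows. This is an illustration only, not the Brainstorm pipeline used in the study: the function names and sample values are hypothetical, and expressing the result as percent change from the baseline mean is an assumption, because the text does not state the exact normalization formula.

```python
def average_epochs(epochs):
    """Average amplitude time courses across epochs, sample by sample."""
    n_epochs = len(epochs)
    return [sum(samples) / n_epochs for samples in zip(*epochs)]


def normalize_to_baseline(times_ms, amplitudes, baseline_ms=(-2000.0, -1000.0)):
    """Express amplitudes as percent change from the baseline-window mean.

    Assumption: percent change is used here; the source gives only the window.
    """
    start_ms, end_ms = baseline_ms
    baseline = [a for t, a in zip(times_ms, amplitudes) if start_ms <= t <= end_ms]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    baseline_mean = sum(baseline) / len(baseline)
    return [(a - baseline_mean) / baseline_mean * 100.0 for a in amplitudes]


# Hypothetical alpha-band amplitudes for two speech exchanges (arbitrary units).
times_ms = [-2000.0, -1500.0, -1000.0, -500.0, 0.0, 500.0]
epochs = [
    [10.0, 10.0, 10.0, 9.5, 8.5, 7.5],
    [10.0, 10.0, 10.0, 8.5, 7.5, 6.5],
]
mean_alpha = average_epochs(epochs)
normalized_alpha = normalize_to_baseline(times_ms, mean_alpha)
print([round(v, 1) for v in normalized_alpha])
```

Negative values after the baseline window correspond to a reduction in alpha amplitude relative to the pre-speech period.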

TTL latency
The time difference between the loop back and direct conditions was recorded by the digital oscilloscope as 7.80 μs for all signals (S1 Fig). Given that the sampling interval of the digital oscilloscope was 3.90 μs, this means that the round-trip latency in the loop back condition was longer than 3.90 μs and no longer than 7.80 μs. Therefore, the one-way (direct) latency, which was taken to be half the round-trip latency of the loop back condition, was evaluated to be 1.95-3.90 μs. No jitter was observed within this time resolution. The theoretical latency of the TTL signals was 2.88 μs, which is the sum of the propagation time at the speed of light over the transmission distance of 472 m (1.58 μs) and the time required for conversion (1.30 μs) by optical I/O module A (Fig 1). Thus, our measured latency coincides with the theoretical one, and is much smaller than the highest temporal resolution of the MEG Acqs (1 ms at 1,000 Hz sampling).
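The theoretical figure is a simple sum of propagation and conversion times, which can be checked as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the default refractive index of 1.0 (vacuum speed of light) is chosen because it is consistent with the propagation time quoted above; the group index of the actual fiber is not given in the text.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def theoretical_ttl_latency_us(distance_m, conversion_us, refractive_index=1.0):
    """Sum of propagation time over the link and the conversion time.

    Hypothetical helper; refractive_index=1.0 is an assumption (see lead-in).
    """
    propagation_us = distance_m * refractive_index / SPEED_OF_LIGHT_M_PER_S * 1e6
    return propagation_us + conversion_us


# Values quoted in the text: 472 m link, 1.30 us conversion time.
total_us = theoretical_ttl_latency_us(472.0, 1.30)
print(f"theoretical one-way TTL latency: {total_us:.2f} us")
```

The result agrees with the quoted 2.88 μs to within rounding, and it falls inside the measured range of 1.95-3.90 μs.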

Video latency
Reference signal. Our evaluations revealed that it took 11.61 μs to generate the square signal from the output of the photodiode. Therefore, the latency of the reference signal was the sum of this delay of 11.61 μs and the direct latency of the TTL signal (1.95-3.90 μs). Effectively, the latency of the reference signal was negligibly short compared to the measurement signal.
Measurement signal. The latencies of the 1,000 LED light flashes at both sites are summarized in a histogram with 2-ms bins (Fig 2, site A blue bars, site B red bars, S1 File). From site A to site B, the mode was 70-72 ms (mean = 76.76 ms, SD = 5.34 ms, and range = 66.42-97.42 ms); from site B to site A, the mode was 76-78 ms (mean = 76.94 ms, SD = 7.61 ms, and range = 63.42-95.42 ms). Here, the transmission takes 2.36 μs, which was calculated from the transmission delay of the HDMI cable (0.5 μs/100 m) and the cable length of 472 m, and conversion takes 400 μs in total (200 μs each for optic I/O modules B and C in Fig 1). As both sites had the same devices and set-ups, the latency distributions were nearly identical, ranging from 60 ms to 100 ms. These latencies are sufficiently short for natural communication, thereby meeting the objective of this system.
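The summary statistics above (modal 2-ms bin, mean, SD, and range) can be computed from a list of per-flash latencies as sketched below. The function name and the sample latencies are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics
from collections import Counter


def summarize_latencies(latencies_ms, bin_width_ms=2.0):
    """Return the modal histogram bin, mean, SD, and range of latencies."""
    # Assign each latency to a bin index, then pick the most populated bin.
    bin_counts = Counter(int(x // bin_width_ms) for x in latencies_ms)
    modal_bin, _ = bin_counts.most_common(1)[0]
    return {
        "mode_bin_ms": (modal_bin * bin_width_ms, (modal_bin + 1) * bin_width_ms),
        "mean_ms": statistics.mean(latencies_ms),
        "sd_ms": statistics.stdev(latencies_ms),  # assumption: sample SD
        "range_ms": (min(latencies_ms), max(latencies_ms)),
    }


# Hypothetical latencies in ms; the measured data are in S1 File.
example_latencies_ms = [70.4, 71.4, 71.4, 73.4, 76.4, 80.4, 85.4]
summary = summarize_latencies(example_latencies_ms)
print(summary)
```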
The processing latency of one of the intermediate devices, the A/V mixer, is 16.67 ms/frame. This latency is small compared to the mean overall latency of about 77 ms. Hence, the majority of the video latency is presumably attributable to the camera and the projector. The signal transmission of the camera and the projector is 1080p/60p, i.e., one frame equates to 16.67 ms. Jitter was presumed to arise from the frame timing of both the camera and the projector, and was therefore calculated as 33.34 ms (16.67 ms × 2 devices). The latency range of our measurement results (about 31.5 ms) closely coincides with this value.
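The arithmetic behind this latency budget can be laid out as follows. This sketch uses only values quoted in the text; the variable names are ours, and treating the A/V mixer as contributing exactly one frame period is an assumption based on the quoted 16.67 ms/frame.

```python
# Values quoted in the text.
FRAME_RATE_HZ = 60.0              # 1080p/60p
CABLE_DELAY_US_PER_100_M = 0.5    # quoted transmission delay of the cable
CABLE_LENGTH_M = 472.0
CONVERSION_US = 400.0             # optic I/O modules B and C combined
MEAN_VIDEO_LATENCY_MS = 76.85     # pooled mean over both directions

frame_period_ms = 1000.0 / FRAME_RATE_HZ
transmission_ms = CABLE_DELAY_US_PER_100_M * CABLE_LENGTH_M / 100.0 / 1000.0
conversion_ms = CONVERSION_US / 1000.0
mixer_ms = frame_period_ms        # assumption: one frame of mixer processing

# Latency not explained by transmission, conversion, or the mixer.
remainder_ms = MEAN_VIDEO_LATENCY_MS - (transmission_ms + conversion_ms + mixer_ms)

# Expected jitter: one frame period each for the camera and the projector.
expected_jitter_ms = 2 * frame_period_ms

# Observed latency ranges (A to B, B to A) from the reported minima and maxima.
observed_ranges_ms = [97.42 - 66.42, 95.42 - 63.42]
mean_observed_range_ms = sum(observed_ranges_ms) / len(observed_ranges_ms)

print(f"remainder attributed to camera and projector: {remainder_ms:.1f} ms")
print(f"expected jitter: {expected_jitter_ms:.2f} ms")
print(f"mean observed range: {mean_observed_range_ms:.1f} ms")
```

Transmission and conversion together contribute well under 1 ms, so roughly 60 ms of the mean latency remains for the camera and the projector under these assumptions.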

Audio latency and synchronization with video
Reference signal. The loop-backed sine wave was compared with the original one on the digital oscilloscope at site A. The latency, calculated as half of the time difference between the two waves, was 202.4 μs. This latency was negligibly short compared to the measurement signal. There was no distortion of the sine wave, based on visual inspection, thus indicating an absence of jitter.
Measurement signal. A comparison of the audio waves of the two split signals demonstrated a constant latency of 3.13 ms (from site A to site B) and 2.78 ms (from site B to site A) with no jitter (Fig 2, red and blue bars, S2 Fig). The reason for the slight directional difference is not clear, but we suspect that it might depend on the distance between the microphone and the speaker at each site. Regardless, the minute directional difference (about 0.4 ms) is arguably not physiologically discernible, and the approximately 3 ms jitter-free latency in both directions is sufficiently low for natural communication.
Audio latency adjustment. Audio signals from one site arrive at the other site about 74 ms earlier than the video signals. As mentioned previously, this situation is known to cause discomfort when viewing video [31,32]. To correct this, and to ensure that our system can be comfortably used for real-time audiovisual communication, the ADL was used to increase the latencies of the audio signals so that they arrive just after the video signals. The ADL was set such that the audio signal latencies were increased by 90 ms, which is approximately two standard deviations above the mean video signal latency (76.85 ± 6.57 ms). Consequently, the ADL-adjusted audio signals had a mean latency of 93 ms (Fig 2, white bar).
Fig 3 shows the normalized mean alpha-band amplitude modulation across all 128 speech exchanges, averaged across the entire cortical surface for each subject (Upper), and averaged across both subjects and mapped onto the template brain (Lower). The brain activity of the subjects at both sites reflects that associated with listening, with time point 0 ms being the moment of speech onset of the opposite party. Alpha-band desynchronization was exhibited by the subjects at both site A and site B during listening. Notably, the desynchronization appears to have commenced before the speech onset of the opposite party. The suppression was primarily concentrated in occipital and left temporal regions, indicating functional involvement of both the visual and auditory systems, and suggesting that each subject could visually predict the onset of the opposite party's speech.

Discussion
We established an MEG hyperscanning system with an audiovisual interface capable of permitting real-time, face-to-face communication between two adults, and verified its TTL signaling and audiovisual transmission latency. The latency of the TTL signal (trigger) was orders of magnitude lower than the maximum temporal resolution of our MEG devices, essentially demonstrating simultaneous and synchronous recording onset for both MEG devices. Site-to-site audio signal latency was about 3 ms, in either direction, which is on par with the speed of transmission of telephone landline audio signals [25]. Moreover, audio latency was completely jitter free, and well below reported thresholds for human detection of musical quality deterioration, indicating that our system would additionally be suitable for communication paradigms based on musical stimuli [33]. Finally, the video signals had short latencies (60-100 ms) and small jitter (SD: 6.57 ms). We also conducted an electrophysiological study and confirmed that this hyperscanning system can reliably transmit A/V information and measure physiological signals.
The latencies and jitter values recorded here are the smallest ever reported for an MEG hyperscanning system. The additional verification of audio synchronization to video signals via ADL is another achievement that has hitherto not been reported. The only other existing MEG hyperscanning system that might have comparable video delay is one reported by Hirata et al. [35]. That system comprises two MEGs co-located in one shielded room, with one MEG designed for adults, and the other designed for infants or small children, thus permitting parent-child hyperscanning. The co-location of the MEGs in the same room allows the audio communication to be transmitted directly through the air. However, the two MEGs are designed for recording subjects in supine positions, and thus facial communication with their system has been accomplished, as in ours, with video signals transmitted via cameras and projectors. Correspondingly, although the exact amount has not been reported, the co-located MEG hyperscanning system reported by Hirata et al. must have some delay in the video signals. Furthermore, the co-location of the subjects in the same room and their auditory communication through air not only means that auditory signals likely precede video signals, but also that the audio cannot be isolated and properly synchronized to the video signals. Finally, their system is limited in that hyperscanning can only be performed between an adult and a child. Our MEG hyperscanning system realizes real-time video and audio communication between two adults, and uses a more natural, face-to-face, seated orientation (Fig 1). Combined with the extremely low video latency and audio-video synchronization, our system should permit natural conversation. See the S1 Appendix for information about ways that latency and jitter could be reduced even further.
As MEG is silent and completely non-invasive, our system should permit cortical-level investigation into numerous kinds of subtle and dynamic brain processes which occur during natural two-way communication. For example, our system could be used to measure cortical brain responses associated with changes in speech patterns and facial expressions between the participating subjects. The ability to measure this is important as brain responses during dynamic real-time conversation may be quite different from isolated event-related responses. Indeed, consider that the N400 event-related potential component associated with semantic processing of a single word is generally observed about 400 ms after the word is presented [36]. In contrast, responses in everyday conversation have been reported to occur in as little as 200 ms after a conversation partner's speech onset [37]. In addition, a prominent response in the occipital cortex to another's blink has been observed at 250 ms, and this brain response is positively correlated with empathic concern in the viewer [38,39]. These kinds of fast brain responses that occur back and forth in real-time communication likely have neurocorrelates in both the sender and the receiver, and thus require high temporal resolution hyperscanning to adequately capture. Moreover, it is important to recognize that in natural, two-way communication, both parties alternate between being the sender and the receiver of auditory and visual information, and the brain regions involved when sending (inferior parietal lobule/sulcus, ventral premotor cortex) and receiving (ventral medial prefrontal cortex) communication are different [40]. Therefore, high spatial resolution is also very important in a hyperscanning system, thereby making MEG a preferable modality for investigating the neurocorrelates of natural communication.
Finally, we would like to highlight the importance of the intermediate devices used to transmit/receive audiovisual signals in hyperscanning systems. The quality of these devices and the validation of their signal processing latencies and characteristics are essential for realizing well-controlled experimental designs in neuropsychophysiological experimentation. Moreover, the minimization of the latency through these intermediate devices, such as via a direct fiber optic connection, is a fundamental priority for hyperscanning research protocols in any modality, not only MEG.
Comprehensively, the establishment and verification of our new MEG hyperscanning system opens the door to a new line of neuroimaging research regarding human communication. Future studies employing our system may shed light on the pathophysiology of neurological and psychiatric disorders that manifest with communication deficits, and inspire development of novel medications or interventions.